Minimax rates for cost-sensitive learning on manifolds with approximate nearest neighbours
Abstract
We study the approximate nearest neighbour method for cost-sensitive classification on low-dimensional manifolds embedded within a high-dimensional feature space. In a cost-sensitive setting, we determine the minimax learning rates for distributions on a smooth manifold. This generalises a classic result of Audibert and Tsybakov. Building upon recent work of Chaudhuri and Dasgupta, we prove that these minimax rates are attained by the approximate nearest neighbour algorithm, where neighbours are computed in a randomly projected low-dimensional space. In addition, we give a bound on the number of dimensions required for the projection, which depends solely upon the reach and dimension of the manifold, combined with the regularity of the marginal.
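The idea of computing neighbours in a randomly projected space and then making a cost-sensitive decision can be illustrated with a minimal sketch. This is not the paper's algorithm or analysis. The Gaussian projection, the function names (`random_projection`, `project`, `predict`), the binary labels, the two costs `cost_fp` and `cost_fn`, and the values of `k` and `projected_dim` are all assumptions made for illustration. The decision threshold `cost_fp / (cost_fp + cost_fn)` is the standard one for binary cost-sensitive classification.

```python
import math
import random


def random_projection(dim_in, dim_out, rng):
    """Return a dim_out x dim_in Gaussian matrix scaled by 1/sqrt(dim_out)."""
    scale = 1.0 / math.sqrt(dim_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(dim_in)]
            for _ in range(dim_out)]


def project(matrix, x):
    """Map a high-dimensional point x to the low-dimensional space."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]


def predict(train_z, train_y, query_z, k, cost_fp, cost_fn):
    """Cost-sensitive k-NN vote on already-projected points (labels in {0, 1})."""
    # Exact neighbours in the projected space act as approximate
    # neighbours in the original space.
    order = sorted(range(len(train_z)),
                   key=lambda i: math.dist(train_z[i], query_z))
    eta_hat = sum(train_y[i] for i in order[:k]) / k
    # Predict 1 when the expected cost of a false positive,
    # cost_fp * (1 - eta), does not exceed that of a false negative,
    # cost_fn * eta.
    threshold = cost_fp / (cost_fp + cost_fn)
    return 1 if eta_hat >= threshold else 0


# Toy usage: a circle (a 1-dimensional manifold) embedded in 50 dimensions.
rng = random.Random(0)
ambient_dim, projected_dim = 50, 5
angles = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(200)]
points = [[math.cos(a), math.sin(a)] + [0.0] * (ambient_dim - 2)
          for a in angles]
labels = [1 if math.sin(a) > 0 else 0 for a in angles]

matrix = random_projection(ambient_dim, projected_dim, rng)
train_z = [project(matrix, p) for p in points]
query = [math.cos(1.0), math.sin(1.0)] + [0.0] * (ambient_dim - 2)
label = predict(train_z, labels, project(matrix, query),
                k=7, cost_fp=1.0, cost_fn=4.0)
```

Raising `cost_fn` relative to `cost_fp` lowers the threshold, so the classifier predicts the positive class on weaker neighbourhood evidence.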
Similar resources
Approximate Nearest Neighbour Search with the Fukunaga and Narendra Algorithm and Its Application to Chromosome Classification
The nearest neighbour (NN) rule is widely used in pattern recognition tasks due to its simplicity and its good behaviour. Many fast NN search algorithms have been developed in recent years. However, in some classification tasks an exact NN search is too slow, and a way to speed up the search is required. For these tasks it is possible to use approximate NN search, which usually increases e...
Approximate Nearest-Neighbour Search with Inverted Signature Slice Lists
In this paper we present an original approach for finding approximate nearest neighbours in collections of locality-sensitive hashes. The paper demonstrates that this approach makes high-performance nearest-neighbour searching feasible on Web-scale collections and commodity hardware with minimal degradation in search quality.
A New Formulation for Cost-Sensitive Two Group Support Vector Machine with Multiple Error Rate
Support vector machine (SVM) is a popular classification technique which classifies data using a max-margin separating hyperplane. The normal vector and bias of this hyperplane are determined by solving a quadratic model, so SVM training amounts to an optimization problem. Among the extensions of SVM, the cost-sensitive scheme refers to a model with multiple costs which conside...
Software Cost Estimation by a New Hybrid Model of Particle Swarm Optimization and K-Nearest Neighbor Algorithms
A successful software project should be completed within a predetermined cost and time. Software is a product whose cost lies mainly in its expert workforce and professionals. The most important part of software cost estimation (SCE) relates to the trained workforce. The creative and abstract nature of software projects makes their cost and time extremely difficult ...
SNN: A Supervised Clustering Algorithm
In this paper, we present a new algorithm, based on the nearest neighbours method, for discovering groups and identifying interesting distributions in the underlying data in labelled databases. We introduce the theory of nearest neighbour sets as the basis for the algorithm S-NN (Similar Nearest Neighbours). Traditional clustering algorithms are very sensitive to the user-defined parameter...
Publication date: 2017